Hierarchical Multi-Attention Transfer for Knowledge Distillation
Authors
Abstract
Knowledge distillation (KD) is a powerful and widely applicable technique for the compression of deep learning models. The main idea of KD is to transfer knowledge from a large teacher model to a small student model, where the attention mechanism has been intensively explored owing to its great flexibility in managing different teacher-student architectures. However, existing attention-based methods usually transfer the same kind of attention across the intermediate layers of neural networks, leaving the hierarchical structure of feature representation poorly investigated for distillation. In this paper, we propose a hierarchical multi-attention transfer framework (HMAT), in which different types of attention are utilized at different levels of representation. Specifically, position-based and channel-based attention transfer characterize low-level and high-level feature representations respectively, while activation-based attention transfer characterizes both low-level and mid-level representations. Extensive experiments on three popular visual recognition tasks, image classification, image retrieval, and object detection, demonstrate that the proposed HMAT significantly outperforms recent state-of-the-art KD methods.
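The abstract does not spell out HMAT's loss functions, but the activation-based attention transfer it builds on is a well-known KD objective (in the style of Zagoruyko & Komodakis): collapse each feature map to a spatial attention map and penalize the distance between the student's and teacher's normalized maps. A minimal PyTorch sketch, with hypothetical function names, assuming teacher and student features share the same spatial size (channel counts may differ, which is what makes attention transfer flexible across architectures):

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse channels of a (N, C, H, W) feature map into a spatial
    attention map by summing squared activations, then flatten and
    L2-normalize per sample."""
    am = feat.pow(2).sum(dim=1).flatten(1)  # (N, H*W)
    return F.normalize(am, dim=1)

def attention_transfer_loss(f_student: torch.Tensor,
                            f_teacher: torch.Tensor) -> torch.Tensor:
    """Mean squared distance between normalized attention maps.
    Requires matching batch and spatial sizes; channel counts can differ."""
    return (attention_map(f_student) - attention_map(f_teacher)).pow(2).mean()
```

Because the channel dimension is summed out, the student and teacher layers being matched do not need the same width, only the same spatial resolution; position-based and channel-based variants would instead compare spatial-position or channel-wise attention statistics.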
Similar resources
Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents
This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...
Multi-Resolution Learning for Knowledge Transfer
Related objects may look similar at low-resolutions; differences begin to emerge naturally as the resolution is increased. By learning across multiple resolutions of input, knowledge can be transfered between related objects. My dissertation develops this idea and applies it to the problem of multitask transfer learning.
Knowledge Transfer for Multi-labeler Active Learning
In this paper, we address multi-labeler active learning, where data labels can be acquired from multiple labelers with various levels of expertise. Because obtaining labels for data instances can be very costly and time-consuming, it is highly desirable to model each labeler’s expertise and only to query an instance’s label from the labeler with the best expertise. However, in an active learnin...
Hierarchical Multi-scale Attention Networks for action recognition
Recurrent Neural Networks (RNNs) have been widely used in natural language processing and computer vision. Among them, the Hierarchical Multi-scale RNN (HM-RNN), a kind of multi-scale hierarchical RNN proposed recently, can learn the hierarchical temporal structure from data automatically. In this paper, we extend the work to solve the computer vision task of action recognition. However, in seq...
Journal
Journal title: ACM Transactions on Multimedia Computing, Communications, and Applications
Year: 2022
ISSN: 1551-6857, 1551-6865
DOI: https://doi.org/10.1145/3568679